43 research outputs found
Training a Feed-forward Neural Network with Artificial Bee Colony Based Backpropagation Method
The back-propagation algorithm is one of the most widely used techniques for
optimizing feed-forward neural network training. Nature-inspired
meta-heuristic algorithms also provide derivative-free solutions for
optimizing complex problems. The artificial bee colony algorithm is a
nature-inspired meta-heuristic that mimics the foraging (food-source
searching) behaviour of bees in a colony, and it has been applied in several
domains for improved optimization outcomes. The method proposed in this paper
is an improved artificial bee colony based back-propagation neural network
training method targeting a fast and improved convergence rate for the hybrid
learning scheme. The results are compared with the genetic algorithm based
back-propagation method, another hybridized procedure of its kind. The
analysis is performed on standard data sets and demonstrates the efficiency of
the proposed method in terms of convergence speed and rate.
Comment: 14 pages, 11 figures
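The abstract does not give the authors' algorithm in detail, but the artificial bee colony component can be illustrated with a minimal sketch: employed, onlooker, and scout phases searching a weight space for a minimum. A simple sum-of-squares objective stands in here for the network's training error; all names and parameter values are illustrative, not the paper's implementation.

```python
import random

def abc_minimize(objective, dim, n_food=10, limit=5, iters=50, seed=0):
    """Minimal artificial bee colony sketch with employed, onlooker,
    and scout phases; returns the best solution found."""
    rng = random.Random(seed)
    foods = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_food)]
    trials = [0] * n_food

    def mutate(i):
        # Perturb one dimension of food source i toward a random partner.
        k = rng.randrange(n_food)
        d = rng.randrange(dim)
        cand = foods[i][:]
        cand[d] += rng.uniform(-1, 1) * (foods[i][d] - foods[k][d])
        return cand

    def greedy(i, cand):
        # Keep the candidate only if it improves; otherwise count a failure.
        if objective(cand) < objective(foods[i]):
            foods[i], trials[i] = cand, 0
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_food):              # employed bees
            greedy(i, mutate(i))
        fits = [1.0 / (1.0 + objective(f)) for f in foods]
        total = sum(fits)
        for _ in range(n_food):              # onlookers: fitness-proportional
            r, acc, i = rng.uniform(0, total), 0.0, 0
            for j, f in enumerate(fits):
                acc += f
                if acc >= r:
                    i = j
                    break
            greedy(i, mutate(i))
        for i in range(n_food):              # scouts replace exhausted sources
            if trials[i] > limit:
                foods[i] = [rng.uniform(-1, 1) for _ in range(dim)]
                trials[i] = 0
    return min(foods, key=objective)
```

In the hybrid setting described above, the candidate vectors would hold network weights and the objective would be the training error, with back-propagation refining the colony's best solution.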
Vocal Tract Length Perturbation for Text-Dependent Speaker Verification with Autoregressive Prediction Coding
In this letter, we propose a vocal tract length (VTL) perturbation method for
text-dependent speaker verification (TD-SV), in which a set of TD-SV systems
is trained, one per VTL factor, and score-level fusion is applied to make a
final decision. Next, we explore the bottleneck (BN) feature extracted by
training deep neural networks with a self-supervised objective, autoregressive
predictive coding (APC), for TD-SV and compare it with the well-studied
speaker-discriminant BN feature. The proposed VTL method is then applied to
the APC and speaker-discriminant BN features. Finally, we combine the VTL
perturbation systems trained on MFCC and the two BN features in the score
domain. Experiments are performed on the RedDots challenge 2016 database of
TD-SV using short utterances with Gaussian mixture model-universal background
model and i-vector techniques. Results show the proposed methods significantly
outperform the baselines.
Comment: Copyright (c) 2021 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other work
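The score-level fusion step described in the abstract can be sketched simply: each VTL-perturbed system produces one verification score per trial, and the fused score is a weighted mean. The function names and the uniform-weight default are assumptions for illustration, not the authors' exact setup.

```python
def fuse_scores(scores_per_factor, weights=None):
    """Score-level fusion across VTL-perturbed systems: weighted mean
    of the per-system verification scores for one trial."""
    n = len(scores_per_factor)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights unless specified
    return sum(w * s for w, s in zip(weights, scores_per_factor))

def decide(scores_per_factor, threshold=0.0):
    # Accept the claimed identity when the fused score exceeds the threshold.
    return fuse_scores(scores_per_factor) > threshold
```

For example, with systems trained at warp factors such as 0.9, 1.0, and 1.1, `decide([0.5, -0.1, 0.2])` fuses the three scores before thresholding instead of trusting any single system.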
Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification
There are a number of studies on extracting bottleneck (BN) features from deep
neural networks (DNNs) trained to discriminate speakers, pass-phrases and
triphone states for improving the performance of text-dependent speaker
verification (TD-SV). However, only moderate success has been achieved. A recent
study [1] presented a time contrastive learning (TCL) concept to explore the
non-stationarity of brain signals for classification of brain states. Speech
signals have similar non-stationarity property, and TCL further has the
advantage of having no need for labeled data. We therefore present a TCL based
BN feature extraction method. The method uniformly partitions each speech
utterance in a training dataset into a predefined number of multi-frame
segments. Each segment in an utterance corresponds to one class, and class
labels are shared across utterances. DNNs are then trained to discriminate all
speech frames among the classes to exploit the temporal structure of speech. In
addition, we propose a segment-based unsupervised clustering algorithm to
re-assign class labels to the segments. TD-SV experiments were conducted on the
RedDots challenge database. The TCL-DNNs were trained using speech data of
fixed pass-phrases that were excluded from the TD-SV evaluation set, so the
learned features can be considered phrase-independent. We compare the
performance of the proposed TCL bottleneck (BN) feature with those of
short-time cepstral features and BN features extracted from DNNs discriminating
speakers, pass-phrases, speaker+pass-phrase, as well as monophones whose labels
and boundaries are generated by three different automatic speech recognition
(ASR) systems. Experimental results show that the proposed TCL-BN outperforms
cepstral features and speaker+pass-phrase discriminant BN features, and its
performance is on par with those of ASR-derived BN features. Moreover, ...
Comment: Copyright (c) 2019 IEEE. Personal use of this material is permitted.
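The TCL labeling scheme described above — uniformly partitioning each utterance into a fixed number of multi-frame segments, with segment indices reused as class labels across utterances — can be sketched as follows. This is a minimal illustration of the labeling step only, assuming each utterance has at least as many frames as segments; it is not the authors' training pipeline.

```python
def tcl_labels(n_frames, n_segments):
    """Assign a time-contrastive class label to each frame: frame i in
    segment j gets label j, and labels are shared across utterances, so a
    DNN trained on them learns temporal structure rather than speaker or
    phrase identity. Assumes n_frames >= n_segments."""
    seg_len = n_frames // n_segments  # remainder frames join the last segment
    return [min(i // seg_len, n_segments - 1) for i in range(n_frames)]
```

A DNN classifier is then trained on (frame, label) pairs pooled over all training utterances, and its hidden-layer activations serve as the BN features.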
An Improved Gauss-Newtons Method based Back-propagation Algorithm for Fast Convergence
The present work deals with an improved back-propagation algorithm based on
the Gauss-Newton numerical optimization method for fast convergence, whereas
conventional back-propagation relies on the steepest descent method. The
algorithm is tested on various datasets and compared with the steepest-descent
back-propagation algorithm. Optimization is carried out using a multilayer
neural network. The efficacy of the proposed method is observed during
training, as it converges quickly on the datasets used in the tests. The
memory required to compute the steps of the algorithm is also analyzed.
Comment: 7 pages, 6 figures, 2 tables. Published with the International
Journal of Computer Applications (IJCA)
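The Gauss-Newton update that distinguishes this approach from steepest descent can be shown on a toy one-parameter least-squares problem: fitting y = exp(a·x). The model and step sizes here are purely illustrative, not the paper's multilayer network setting.

```python
import math

def gauss_newton_1d(xs, ys, a0=0.0, iters=20):
    """Gauss-Newton for one parameter a in the model y = exp(a*x):
    a <- a + (J^T r) / (J^T J), where the residuals are
    r_i = y_i - exp(a*x_i) and the Jacobian entries are
    J_i = d exp(a*x_i)/da = x_i * exp(a*x_i)."""
    a = a0
    for _ in range(iters):
        r = [y - math.exp(a * x) for x, y in zip(xs, ys)]
        J = [x * math.exp(a * x) for x in xs]
        JtJ = sum(j * j for j in J)
        Jtr = sum(j * ri for j, ri in zip(J, r))
        if JtJ == 0:
            break  # degenerate Jacobian; no informative step
        a += Jtr / JtJ
    return a
```

Unlike a fixed-learning-rate steepest-descent step, this update uses local curvature information (J^T J), which is what gives Gauss-Newton-style back-propagation its faster convergence near a minimum.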
Data augmentation enhanced speaker enrollment for text-dependent speaker verification
Data augmentation is commonly used for generating additional data from the
available training data to achieve a robust estimation of the parameters of
complex models like the one for speaker verification (SV), especially for
under-resourced applications. SV involves training speaker-independent (SI)
models and speaker-dependent models where speakers are represented by models
derived from an SI model using the training data for the particular speaker
during the enrollment phase. While data augmentation for training SI models is
well studied, data augmentation for speaker enrollment is rarely explored. In
this paper, we propose the use of data augmentation methods for generating
extra data to empower speaker enrollment. Each data augmentation method
generates a new data set. Two strategies for using the data sets are explored:
the first is to train separate systems and fuse them at the score level, and
the other is to conduct multi-condition training. Furthermore, we study the
effect of data augmentation under noisy conditions. Experiments are performed
on the RedDots challenge 2016 database, and the results validate the
effectiveness of the proposed methods.
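The two strategies above can be contrasted with a deliberately tiny sketch. An additive-noise augmentation stands in for whatever perturbations are used, and a per-dimension mean with a negative-distance score stands in for the GMM-UBM/i-vector enrollment model; every function here is a hypothetical illustration, not the paper's system.

```python
import random

def augment(utts, n_copies=2, noise=0.1, seed=0):
    """Each augmentation method yields a new data set; here one
    additive-noise method stands in for noise/reverberation variants."""
    rng = random.Random(seed)
    return [[[x + rng.gauss(0, noise) for x in u] for u in utts]
            for _ in range(n_copies)]

def train_mean(utts):
    # Toy "speaker model": per-dimension mean of the enrollment utterances.
    dim = len(utts[0])
    return [sum(u[d] for u in utts) / len(utts) for d in range(dim)]

def score(model, test_utt):
    # Higher is better: negative squared distance to the enrolled model.
    return -sum((m - x) ** 2 for m, x in zip(model, test_utt))

def score_fusion(orig, augmented_sets, test_utt):
    # Strategy 1: one system per data set, fused at the score level.
    models = [train_mean(orig)] + [train_mean(a) for a in augmented_sets]
    return sum(score(m, test_utt) for m in models) / len(models)

def multi_condition(orig, augmented_sets, test_utt):
    # Strategy 2: pool all data sets and train a single system.
    pooled = orig + [u for a in augmented_sets for u in a]
    return score(train_mean(pooled), test_utt)
```

The design trade-off is visible even at this scale: score fusion keeps per-condition systems specialized at the cost of running several of them, while multi-condition training yields one system averaged over all conditions.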
A Deep Learning Approach to Detect Lean Blowout in Combustion Systems
Lean combustion is environmentally friendly, with low NOx emissions, and also
provides better fuel efficiency in a combustion system. However, moving
towards lean combustion can make engines more susceptible to lean blowout. Lean
blowout (LBO) is an undesirable phenomenon that can cause sudden flame
extinction leading to sudden loss of power. During the design stage, it is
quite challenging for the scientists to accurately determine the optimal
operating limits to avoid sudden LBO occurrence. Therefore, it is crucial to
develop accurate and computationally tractable frameworks for online LBO
detection in low NOx emission engines. To the best of our knowledge, for the
first time, we propose a deep learning approach to detect lean blowout in
combustion systems. In this work, we utilize a laboratory-scale combustor to
collect data for different protocols. We start far from LBO for each protocol
and gradually move towards the LBO regime, capturing a quasi-static time series
dataset at each condition. Using one of the protocols in our dataset as the
reference protocol and with conditions annotated by domain experts, we find a
transition state metric for our trained deep learning model to detect LBO in
the other test protocols. We find that our proposed approach is more accurate
and computationally faster than other baseline models at detecting the
transitions to LBO. Therefore, we recommend this method for real-time
performance monitoring in lean combustion engines.
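The online-detection idea — tracking a transition-state metric derived from a trained model's outputs and raising an alarm when it crosses a threshold — can be sketched independently of the model itself. The per-window probabilities, window length, and threshold below are hypothetical stand-ins for the deep learning model's outputs described in the abstract.

```python
def lbo_transition_metric(probs, window=5):
    """Moving average of the model's per-window LBO-proximity probabilities:
    a smoothed transition-state metric suitable for online monitoring."""
    out = []
    for i in range(len(probs)):
        lo = max(0, i - window + 1)
        out.append(sum(probs[lo:i + 1]) / (i + 1 - lo))
    return out

def first_alarm(probs, threshold=0.5, window=5):
    # Index of the first window whose smoothed metric crosses the threshold,
    # or None if the combustor never approaches the LBO regime.
    for i, m in enumerate(lbo_transition_metric(probs, window)):
        if m > threshold:
            return i
    return None
```

Smoothing over a window trades a small detection delay for robustness against single-window misclassifications, which matters when an alarm triggers a control action in a running engine.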